Document Categorizer Agent for Computer Science Academic Papers

Authors

Khalifa Chekima

Patricia Anthony

Abstract

This paper presents a Document Categorizer Agent that categorizes computer science academic papers in .pdf format, such as journal articles and conference proceedings. We propose using a set of terms stored in a database to categorize these papers. A few methods and algorithms from related work are considered to improve the categorization process. The agent parses each document, calculates the frequency of each term, and matches the terms found against the term set stored in the database. We evaluated the agent on a number of computer science papers and show that this term database can be used to categorize documents. The agent assigns each text document to a predetermined category based on its extracted keywords, which can make searching more efficient and save the user time in finding the desired document.
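The process described in the abstract (parse, count term frequencies, match against a term database) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the text has already been extracted from the PDF, and the names `TERM_DATABASE`, `term_frequencies`, and `categorize`, as well as the categories and terms themselves, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical term database: category -> set of terms (illustrative only,
# not the term set used in the paper).
TERM_DATABASE = {
    "networking": {"router", "protocol", "packet", "bandwidth"},
    "machine learning": {"classifier", "training", "neural", "regression"},
    "databases": {"query", "index", "transaction", "schema"},
}

def term_frequencies(text):
    """Lowercase the text, split it into alphabetic tokens, and count each token."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def categorize(text, term_database=TERM_DATABASE):
    """Return the category whose terms occur most often, or None if none occur."""
    freqs = term_frequencies(text)
    # Score each category by summing the frequencies of its terms in the text.
    scores = {
        category: sum(freqs[term] for term in terms)
        for category, terms in term_database.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

sample = "The classifier improved after training; each training run tuned the neural model."
print(categorize(sample))
```

Summing raw counts is only one possible scoring rule; the abstract does not say how matched terms are weighted, so any weighting or normalization here would be a further assumption.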

Similar resources

Evaluation of the Document Categorization in "Fixed-point Observatory"

“Fixed-point observatory” is a prototype that helps users grasp recent trends in their fields of interest from large-scale information. It consists of a content-based categorizer, a named-entity-based categorizer and a multiple-document summarizer. We have evaluated the content-based categorizer, which adopts the simple “bag-of-words” model. Though the quality seems to be sufficient for rough cla...

Algorithmic Detection of Computer Generated Text

Computer generated academic papers have been used to expose a lack of thorough human review at several computer science conferences. We assess the problem of classifying such documents. After identifying and evaluating several quantifiable features of academic papers, we apply methods from machine learning to build a binary classifier. In tests with two hundred papers, the resulting classifier ...

Experiments with HITEC: a Hierarchical Text Categorizer

This paper presents experiments on the effectiveness of the HITEC software (HIerarchical TExt Categorizer) on several natural languages (English, German) and with various kinds of text corpora. HITEC applies the UFEX (Universal Feature EXtractor) method for hierarchical text categorization. The obtained results show that HITEC outperforms its known competitors on the investigated corpora, its...

Preparation of Papers for the IAENG International Journal of Computer Science

These instructions give you guidelines for preparing papers for the journal IAENG International Journal of Computer Science. Use this document as a template if you are using LaTeX. Motion tracking and object recognition often use cameras that are mounted in motion platforms like pantilt units, linear tables and even robots. Tracking can be automated by visually servoing the platform’s degrees-o...

Networks of reader and country status: an analysis of Mendeley reader statistics

The number of papers published in journals indexed by the Web of Science core collection is steadily increasing. In recent years, nearly two million new papers were published each year; somewhat more than one million papers when primary research papers are considered only (articles and reviews are the document types where primary research is usually reported or reviewed). However, who reads the...

Publication date: 2010
